DOCUMENT RESUME

ED 104 284                                             HE 006 411

AUTHOR          McKeachie, W. J.; Lin, Y. G.
TITLE           Use of Student Ratings in Evaluation of College
                Teaching. Final Report.
INSTITUTION     Michigan Univ., Ann Arbor.
SPONS AGENCY    National Inst. of Education (DHEW), Washington, D.C.
PUB DATE        Mar 75
GRANT           NE-G-00-3-0110
NOTE            41p.

EDRS PRICE      MF-$0.76 HC-$1.95 PLUS POSTAGE
DESCRIPTORS     College Students; Educational Improvement; *Effective
                Teaching; Faculty; *Higher Education; Rating Scales;
                Research Methodology; Research Projects; Statistical
                Data; *Student Attitudes; Teacher Behavior; *Teacher
                Evaluation; Teacher Improvement; Teacher Promotion;
                *Teacher Rating


ABSTRACT

Student ratings have been used for three major purposes, in each case
by a different group: (1) to assist teachers to improve their teaching;
(2) to aid administrative decisions with respect to promotions or salary
increases of teachers; and (3) to provide descriptions of courses and
teachers for students choosing to enroll in courses or sections of
courses. The three studies described in this report relate to the first
two of these purposes: using student ratings to improve teaching; an
experimental investigation of factors affecting university promotion
decisions; and whether discrepancies between student ratings, teacher
expectations, and teacher ideals result in changes in teacher behavior.
For each of the studies, the hypothesis, method, procedures, and results
are indicated. (MJM)



Use of Student Ratings in Evaluation 

of College Teaching 



Final Report 



March 1975 



National Institute of Education 
Grant No. NE-G-00-3-0110 



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR
OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL
NATIONAL INSTITUTE OF EDUCATION POSITION OR POLICY.



FINAL REPORT 
Grant No. NE-G-00-3-0110



USE OF STUDENT RATINGS IN EVALUATION OF COLLEGE TEACHING 



W. J. McKeachie and Y. G. Lin



Researchers 

Monica Daugherty 
Mary M. Moffett 
Cynthia Neigler 
John Nork 
Martina Walz 



The University of Michigan 
Ann Arbor, Michigan 



March 1975



NATIONAL INSTITUTE OF EDUCATION 






TABLE OF CONTENTS

LIST OF TABLES

BACKGROUND

STUDY I: USING STUDENT RATINGS TO IMPROVE TEACHING

    Hypothesis
    Measures
        Student perception of teaching and learning
        Psychological thinking
        Attitudes
        Curiosity
    Sample
    Procedures
    Results
    Conclusion

STUDY II: AN EXPERIMENTAL INVESTIGATION OF FACTORS AFFECTING
          UNIVERSITY PROMOTION DECISIONS

    Introduction
    Method
        Subjects
        Materials
    Procedure
    Results
        Additional result: sex and department
    Discussion

STUDY III: DO DISCREPANCIES BETWEEN STUDENT RATINGS, TEACHER
           EXPECTATIONS, AND TEACHER IDEALS RESULT IN CHANGES
           IN TEACHER BEHAVIOR?

    Introduction
    Method
        Sample
        Measures
        Procedure
    Results and Discussion

REFERENCES

APPENDIX

LIST OF TABLES

Table

 1. Student Sample

 2. Distribution of Mean Pre-Test Student Ratings on
    "Impact of Course"

 3. Effects of Feedback on Teaching Effectiveness:
    Factor Scores of Student Perceptions of Teachers

 4. Effect of Feedback on Teaching Effectiveness,
    as Measured by Student Performance

 5. Post-Course Student Ratings of Impact for Teachers
    Differing on Initial Ratings

 6. Effects of Feedback on Low, Middle, and Highly
    Rated Teachers

 7. Combinations of Teaching Ability and Research Productivity
    Used in the Experimental Materials

 8. Subjects' Judgments About the Realism of the Materials

 9. Percentage "Yes" Decisions at Each Level of Teaching
    Ability and Research Productivity

10. Mean Salary Increases at Each Level of Teaching Ability
    and Research Productivity

11. Mean Ranks of Promotability

12. Analyses of Candidate's Sex and Academic Department
    as Factors Influencing Decisions

13. Quantitative Estimates of the Amount of Research Emphasis
    in Promotion Decisions

14. The Effect of Discrepancy Between Student Pre-Test Rating,
    Teacher Expected, and Ideal Ratings on Change of Student
    Ratings from Pre-Test to Post-Test



BACKGROUND 



Student ratings have been used for three major purposes, in each case by
a different group:

1. to assist teachers in improving their teaching;

2. to aid administrative decisions with respect to
   promotions or salary increases of teachers; and

3. to provide descriptions of courses and teachers
   for students choosing to enroll in courses or
   sections of courses.



The basic premise of the studies proposed herein is that different sets of
items are useful for each of these three purposes. This report describes
studies relevant to the first two of these purposes.



STUDY I: USING STUDENT RATINGS TO IMPROVE TEACHING



So far as we know, no studies have tested the usefulness of student ratings
for all three purposes. There have, however, been studies relevant to one part
of the problem: the use of student ratings to affect teacher behavior.

In a study of effects of feedback of pupil ratings on elementary school
teachers' behavior, Gage, Runkel, and Chatterjee (1963) found that sixth
grade teachers changed their behavior in the direction of pupils' descriptions
of the ideal teacher. Tuckman and Oliver (1968) found at the high school level
that feedback from student ratings changed teachers' behavior significantly
as compared with teachers not receiving feedback. Miller (1971), however,
more recently found no significant effect of feedback of midsemester college
student ratings on the end-of-semester ratings of graduate student teaching
assistants. He did find that, in two of three courses, mean final examination
scores were higher for students whose instructors received feedback of student
ratings during the semester than for students whose instructors did not receive
feedback. Miller concluded that "...the results suggest some limitations in
the use of student ratings as a method of improving instruction." Centra's
results (1973) support this conclusion.

A previous finding from our own research program helps explain the Miller
results and points toward further research needed. In this study Pambookian
(1972) also found little overall effect of feedback of student ratings, but
detailed analysis of his data indicated that significant positive changes did
occur for those instructors in the middle third of the distribution of initial
ratings. Pambookian suggests that those instructors in the top third had little
need to improve while those in the bottom third may have become more anxious
and defensive.

Miller provided feedback on a 15-item scale of which only 10 items dealt 
with instructor characteristics. We believe that more specific feedback on a 
large number of items is more likely to be helpful than feedback on a smaller 
number of more general items. We believe that the effect of feedback will be 
specific and is more likely to be apparent if separate measures on several 
dimensions are used rather than a single global measure. 



HYPOTHESIS 

Our hypothesis was that teachers given personal feedback of student ratings
of instruction, with encouragement and suggestions for improvement, during a
term would be superior on end-of-term measures of teaching effectiveness to
teachers receiving printed feedback or those receiving no feedback.



MEASURES

Student Perception of Teaching and Learning

The primary measure was the Michigan Student Perception of Teaching and
Learning form. The form has evolved over a period of 20 years in which items
from the major student rating of teaching forms in use in the 1950's were
factor analyzed to obtain items best representing the major dimensions used
by students in rating teachers (Isaacson, et al., 1964). The form was then
revised as the result of a series of validity studies (McKeachie, et al., 1971)
and of a multiple discriminant analysis (McKeachie and Lin, in press). The
version of the form used in this study consisted of 32 items. The form
administered as a post-test differed from the pre-test form in containing two
additional items requiring evaluation of the instructor's general teaching
effectiveness and the value of the course as a whole. The form was administered
at approximately the one-third point of the term and readministered with other
outcome measures during the 11th and 12th weeks of the 14-week term. One-third
of the sample did not receive the pre-test administration.



Psychological Thinking

While most other studies of feedback of student ratings have dealt with
the effect of feedback on later student ratings, the ultimate criterion is not
student ratings but student learning. An important second outcome measure
therefore was student performance on items selected from the Introductory
Psychology Criteria Test (Milholland, 1966). This test was developed by a
Committee of the Division on Teaching of the American Psychological Association
and has been used in much of our previous research. It provides a measure
of achievement of the higher level cognitive outcomes of the Bloom Taxonomy.



Attitudes 

An Attitude toward Psychology Questionnaire consisting of eight Likert-type
items drawn from Carrier's scale (1966) was administered as part of the
test battery. Since previous use of the test had indicated that some of the
test items appeared to involve an uncritical, naive endorsement of psychology,
the scoring key was changed to indicate simply agreement, disagreement, or
undecided. Only those items were used which five introductory psychology
teachers agreed to be appropriate in terms of the goals of introductory
psychology courses.

A locally constructed 10-item Likert scale of Attitude toward Self was
used to assess impact of the introductory classes on the student as a person.
In one of the four introductory psychology courses involved in the experiment,
a scale of Attitude toward Mental Illness was being used as a device for
assessing the effect of student participation at a mental hospital. Thus, it
was also available as an additional criterion measure in that course.

Curiosity 

To test student curiosity about psychology, students were informed that
they could skip a section of the test involving descriptions of several
experiments and a behavior modification case. This test was scored in terms of
how many of the studies a student read, as well as the student's rating of
interest in the study.

SAMPLE 

The sample consisted of 37 graduate student teaching fellows and three
faculty members recruited from the individuals teaching in the introductory
psychology courses at The University of Michigan, Fall 197[?]. Of the 37
teaching fellows, 21 had had at least one year of previous teaching experience,
and all of the faculty members were full professors with 1[?] to 40 years of
experience. Twenty-two of the teachers were men; 18 were women. Teachers were
assigned randomly to the three groups within each course.
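
The random assignment within each course can be illustrated with a short
sketch. The report does not say how the randomization was carried out, so
everything below is an assumption made for illustration: the roster, the
course sizes, the function name assign_within_course, and the round-robin
dealing used to keep group sizes nearly equal.

```python
import random
from collections import defaultdict

# Group labels follow the three conditions named in the report.
GROUPS = ["personal feedback", "printed feedback", "no feedback"]


def assign_within_course(teachers, seed=0):
    """Randomly assign teachers to the three groups, separately per course.

    teachers: list of (teacher_id, course) pairs.
    Returns a dict mapping teacher_id to a group label.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_course = defaultdict(list)
    for teacher_id, course in teachers:
        by_course[course].append(teacher_id)

    assignment = {}
    for course in sorted(by_course):
        ids = by_course[course]
        rng.shuffle(ids)
        # Deal the shuffled teachers round-robin so that group sizes
        # within a course differ by at most one.
        for position, teacher_id in enumerate(ids):
            assignment[teacher_id] = GROUPS[position % len(GROUPS)]
    return assignment


# Hypothetical roster: 40 teachers spread over three courses.
courses = ["170"] * 13 + ["171"] * 17 + ["172"] * 10
roster = [(f"T{i:02d}", course) for i, course in enumerate(courses)]
assignment = assign_within_course(roster)
```

Randomizing within each course, rather than across the whole sample, keeps
course differences from being confounded with the feedback condition.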

PROCEDURES

Members of the research staff administered the Michigan Student Perception
of Teaching and Learning form during the last 15 to 20 minutes of a class
period during the 5th to 7th weeks of the 14-week term. (See the Appendix
for a copy of the form.) Wording of some items was changed to the present,
rather than past, tense to make the form appropriate for administration early
in the course rather than at the end of the course. Table 1 indicates the N
in each of the classes in the sample. As the table indicates, there was fairly
serious shrinkage in some sections, but a sufficient number of students gave
post-test ratings to provide reliable ratings of instruction. In each of the
three groups one section had less than 50% return on the post-test. One, of
course, worries about the possible bias of the remaining sample, but some
comfort may be taken in the fact that these classes were equally distributed
among the three experimental groups.

Rating forms were scored by computer, and the computer print-out presented
the mean of the class and the mean of all classes in the same course for each
of the seven dimensions for which the form is scored: impact on students,
rapport, teacher as person, group interaction, difficulty, structure, and
feedback.

Following these factor scores were individual class and course means for
each item listed under the heading of the appropriate factor.
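
The print-out described above can be sketched in a few lines. The item
ranges for the seven dimensions are those listed in Table 3 of this report.
Treating each dimension score as the simple mean of its items is an
assumption (the report calls them factor scores without giving the scoring
rule), and the function names and the class data are hypothetical.

```python
from statistics import mean

# Item ranges per dimension as listed in Table 3 (1-based, inclusive).
DIMENSIONS = {
    "impact on students": (1, 16),
    "rapport": (17, 20),
    "teacher as person": (21, 22),
    "group interaction": (23, 25),
    "difficulty": (26, 27),
    "structure": (28, 29),
    "feedback": (30, 32),
}


def dimension_means(item_means):
    """Score one class: item_means holds 32 mean item ratings (index 0 = item 1).

    Assumption: a dimension score is the unweighted mean of its items.
    """
    return {
        name: mean(item_means[first - 1:last])
        for name, (first, last) in DIMENSIONS.items()
    }


def course_means(class_scores):
    """Average each dimension over all classes in the same course."""
    return {name: mean(c[name] for c in class_scores) for name in DIMENSIONS}


# Hypothetical item means for two classes in one course.
class_a = dimension_means([4.0] * 16 + [5.0] * 4 + [2.0] * 12)
class_b = dimension_means([3.0] * 16 + [4.0] * 4 + [3.0] * 12)
course = course_means([class_a, class_b])
```

Printing class_a next to course for each dimension gives the comparison a
teacher would have seen: the class mean beside the mean of all classes in
the same course.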



The print-outs were mailed during the [?]th to 8th weeks of the term to
those in the "Printed Feedback" group and were returned and reviewed
personally during the [?]th to 8th weeks of the term with each teaching fellow
in the group receiving personal feedback. Teachers were given their choice of
receiving feedback from their course supervisor or from Professor McKeachie.
Professor McKeachie gave the personal feedback to 12 of the 14 teachers in this
group, and Professor Elton McNeil and Judith Reitman each gave the personal
feedback to one graduate teaching assistant whom they were supervising.



TABLE 1

STUDENT SAMPLE

(? = entry illegible in the source)

Section      Students    Pre-Test    Post-Test
             Enrolled        N           N

Group I
?                ?           ?          19
170-11           ?           ?          29
?                ?           ?           ?
170-16           ?           ?           ?
171-?           16          16           7
171-?           19          14          17
171-33           ?           ?          19
171-?            ?           ?          26
171-?            ?          20          20
?                ?          18          15
172-?            ?          19          22
172-?           18           ?          18
172-1?          10          15          16

Group II
170-?           28          19          19
170-6           30          22          18
170-?0          17          14           8
170-8           25          16          18
170-22          32          15          25
171-?           21          24          11
171-25          27          18          19
171-?           21          19          19
171-?           14          17          12
172-?           21          16          18
172-?           24          16          15
172-?            ?          15          13

Group III
17?-?           25          --          10
17?-?           16          --          28
170-7           30          --          26
170-1?          25          --          35
171-16          26          --          22
171-1?          26          --          21
171-?           23          --          20
171-38          26          --          24
171-30          15          --           ?
171-41          25          --          14
172-?           14          --          13
172-8           18          --          16

Totals         845         488         695

At the beginning of the feedback sessions teachers were asked to fill out
forms indicating their expectation of the student ratings on each dimension,
their own self-perceptions, and where they would like to be. Typically,
Professor McKeachie then asked them how the class was going and, in response to
their reactions, suggested how the student ratings confirmed (or rarely did
not confirm) their perceptions. He then pointed out factors on which the
teacher differed significantly from the mean of all classes. If there seemed
to be any problems, he suggested some possible alternative methods of handling
the problem. All of the mean ratings, however, were relatively favorable (see
Table 2), so that the hope that he could help teachers cope with very negative
feedback was not realized.



TABLE 2

DISTRIBUTION OF MEAN PRE-TEST STUDENT RATINGS ON "IMPACT OF COURSE"(a)

Mean Rating      Personal      Printed
                 Feedback      Feedback

4.7 - 5.0            0             0
4.5 - 4.6            1             1
4.3 - 4.4            2             1
4.1 - 4.2            5             4
3.9 - 4.0            4             5
3.7 - 3.8            3             2
3.5 - 3.6            1             2
3.3 - 3.4            0             1

N(b)                16            16

(a) "Impact of Course" mean consisted of mean ratings on 16
    items such as "The instructor stimulates my intellectual
    curiosity."

    1 = almost never     4 = often
    2 = seldom           5 = very often
    3 = occasionally     6 = almost always

(b) The N in this table is larger than the N of teachers
    because two teachers in Group 1 and three teachers in
    Group 2 taught two sections.



During the 11th and 12th weeks of the 14-week term students were invited
by postcard to attend evening sessions to take the tests measuring the
dependent variables, including the post-test of student perception of learning
and teaching.

Upon completion of the tests each student received written feedback
describing the experimental design and the nature of the tests used.

RESULTS

As Table 3 indicates, the student ratings at the end of the term support
the primary hypothesis. Group I was rated as most effective.

The effect of feedback upon effectiveness as measured by student achievement
was not as clear cut. As Table 4 indicates, the hypothesis received
statistically significant support in terms of student achievement on the
Criteria Test for Introductory Psychology for classes of Psychology 170 and for
the measure of curiosity in Psychology 171, but the other criteria did not
support the hypothesis.
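
The F values reported in Tables 3 and 4 compare the three feedback groups.
As a hedged illustration of how such a statistic is obtained, the sketch
below computes a one-way analysis of variance F ratio. It assumes that class
sections are the unit of analysis (the tables report mean class scores and
numbers of sections); the scores and the function name one_way_anova_f are
hypothetical and are not taken from the report.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-groups sum of squares: group size times squared deviation
    # of the group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: deviations of scores from their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical mean class scores for 13 sections of one course.
personal = [19.0, 20.0, 18.0, 19.0]
printed = [16.0, 17.0, 15.0, 16.0, 16.0]
no_feedback = [17.0, 18.0, 17.0, 18.0]

f_value, df_b, df_w = one_way_anova_f([personal, printed, no_feedback])
```

With 13 sections in three groups the test has 2 and 10 degrees of freedom,
so only fairly large differences among group means reach significance.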

We had expected the favorable effects of personal counselling to be most
helpful to those teachers with poor ratings. Pambookian had shown printed
feedback to be helpful to those in the middle third of his distribution. We had
expected personal feedback to help reduce the potential negative effects of
negative feedback by reducing anxiety, by increasing motivation to improve
through increasing hope of success, and by suggesting alternative behaviors to
those criticized. We thus separated the groups into thirds for further
analyses. As Table 5 indicates, the results were in the direction predicted. A
similar analysis of the other criterion measures was also in the predicted
direction for the test of thinking, the test of attitude toward psychology, and
the test of attitude toward self, but not for the test of curiosity. None of
these differences, however, were significant (see Table 6).
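
The analysis by thirds can be sketched as follows. The sketch ranks teachers
on their initial rating, cuts the ranked list into low, middle, and high
thirds, and compares mean post-course ratings between the two feedback
groups at each level. All ratings below are hypothetical, and the cut rule
(equal-sized thirds within each group) is an assumption; the report does
not state how the thirds were formed.

```python
from statistics import mean


def split_into_thirds(teachers):
    """teachers: list of (initial_rating, final_rating) pairs.

    Returns the teachers grouped into low, mid, and high thirds
    by initial rating.
    """
    ranked = sorted(teachers, key=lambda t: t[0])
    n = len(ranked)
    cut1, cut2 = n // 3, (2 * n) // 3
    return {"low": ranked[:cut1], "mid": ranked[cut1:cut2], "high": ranked[cut2:]}


def mean_final(by_third):
    """Mean post-course rating at each level of initial rating."""
    return {level: mean(final for _, final in rows) for level, rows in by_third.items()}


# Hypothetical (initial, final) "impact" ratings for two feedback groups.
personal = [(3.5, 3.9), (3.6, 3.9), (3.9, 4.2), (4.0, 4.2), (4.3, 4.5), (4.4, 4.5)]
printed = [(3.5, 3.5), (3.6, 3.5), (3.9, 4.0), (4.0, 4.0), (4.3, 4.5), (4.4, 4.5)]

personal_means = mean_final(split_into_thirds(personal))
printed_means = mean_final(split_into_thirds(printed))

# Advantage of personal over printed feedback at each level. The prediction
# is that this advantage is largest for initially low-rated teachers.
advantage = {level: personal_means[level] - printed_means[level] for level in personal_means}
```

Comparing the advantage at the low level with the advantage at the other
levels is the "difference of differences" that the note to Table 5 tests.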

CONCLUSION

As the introduction indicated, previous studies had been rather discouraging
about the usefulness of student ratings in improving teaching. The results
of the present study indicate that ratings alone are not very helpful
but that a plan for using student ratings in counselling teachers can be
helpful.



TABLE 3

EFFECTS OF FEEDBACK ON TEACHING EFFECTIVENESS:
FACTOR SCORES OF STUDENT PERCEPTIONS OF TEACHERS

(? = entry illegible in the source)

                                      Mean Class Score
                               Personal   Printed      No
                               Feedback   Feedback  Feedback      F       p

Overall Item(a)
  General teaching
    effectiveness                3.6        3.1       3.0       4.55    .017
  Overall value of course         ?         3.0       3.0       5.77    .007

Dimension(b)
  Impact on students(c)
    (Items 1 to 16)              4.17       3.95      3.80      3.60    .037
  Rapport
    (Items 17 to 20)             4.63       4.59       ?         .17    .846
  Teacher as person(d)
    (Items 21 and 22)            2.18       2.60      2.46      2.82    .072
  Group interaction
    (Items 23 to 25)             4.16        ?        4.10       .69    .506
  Difficulty
    (Items 26 and 27)            2.23       2.36       ?         .23    .797
  Structure
    (Items 28 and 29)            3.55       3.13      3.03      1.31    .281
  Feedback
    (Items 30 to 32)             4.14       4.09      3.79       .90    .413

Number of sections               13         13         ?

(a) Scale for overall items:
    5 = Excellent; 4 = Very good; 3 = Good; 2 = Fair; 1 = Poor

(b) Scale for dimensions:
    1 = Almost never or almost nothing    4 = Often or much
    2 = Seldom or little                  5 = Very often
    3 = Occasionally or moderate          6 = Almost always or a great deal

(c) High factor score represents high impact.

(d) High score is over-personal.



TABLE 4

EFFECT OF FEEDBACK ON TEACHING EFFECTIVENESS,
AS MEASURED BY STUDENT PERFORMANCE

(? = entry or digit illegible in the source)

                                  Mean Class Scores
Measure                    Personal   Printed      No          F        p
                           Feedback   Feedback  Feedback

Psychology 170
  Psychological thinking    19.09     16.??      17.52      5.9?1     .02
  Attitude to Psychology    10.50      9.57      10.75      8.014     .01
  Attitude toward self       4.69      4.16       5.85      2.567     .12
  Curiosity                    ?      12.41      17.49      2.584     .12
  Number of sections           4         5          4

Psychology 171
  Psychological thinking    15.65     15.91      15.90       .045     .96
  Attitude to Psychology    10.04     10.47       9.11      1.925     .18
  Attitude toward self       5.07      4.14       4.57      ?.241     .14
  Curiosity                 17.90      9.56      15.81      5.557     .02
  Number of sections           6         ?          6

Psychology 172 & 192
  Psychological thinking    17.05     16.61      16.45       .105     .90
  Attitude to Psychology     9.92     11.10       9.7?       .962     .42
  Attitude toward self       4.78      4.92       4.56       .218     .81
  Curiosity                 11.97     16.07      11.77      1.208     .35
  Number of sections           4         4          5
TABLE 5

POST-COURSE STUDENT RATINGS OF IMPACT
FOR TEACHERS DIFFERING ON INITIAL RATINGS

Initial        Personal Feedback         Printed Feedback
Ratings        M      S.D.     N         M      S.D.     N

Low           3.9     .21      5        3.5     .44      4
Mid           4.2     .19      4        4.0     .06      4
High          4.5     .27      5        4.5     .51      5

t = 1.02 for differences of differences,
Low vs. mid and high.



TABLE 6

EFFECTS OF FEEDBACK ON LOW, MIDDLE, AND HIGHLY RATED TEACHERS

[Table body not legible in the source scan.]



STUDY II: AN EXPERIMENTAL INVESTIGATION OF FACTORS AFFECTING
UNIVERSITY PROMOTION DECISIONS



INTRODUCTION 

Current pressures for use of student ratings focus heavily on their use
as evidence on teaching effectiveness for decisions about faculty promotions
or salary increases. Although data on the validity of student ratings are
important in determining whether student ratings should be used, we have no
data on how they actually influence decisions. We need to begin to determine
what kind of contribution student ratings can make to such decisions. One
first step is to determine what information decision makers use from reports
of student ratings. The exploratory study here reported provides data relevant
to the following questions:

Does the type of information about teaching competence
affect promotion decisions?

What are the relative weights of teaching and research
on promotions?

The present report describes an initial attempt to investigate empirically 
the factors involved in the decisions made by university promotion committees 
concerning the promotion of assistant professors to the rank of associate pro- 
fessor. For purposes of simplicity, only two major factors were considered 
in the present study— teaching ability and research productivity. The study 
was designed not only to give information about the effect of these variables 
on promotion but also to determine whether the methodology employed would be 
capable of providing relatively precise estimates of the relative emphasis 
on teaching ability and research productivity in salary and promotion decisions. 

One of the primary issues of interest in the study was whether or not
the type of information provided in the evaluation of a promotion candidate's
teaching ability would affect the promotion decision. The two types of
evaluative information that were investigated were: (a) the department
chairman's subjective report of the candidate's teaching ability, and (b) a
summary of the student evaluations of the candidate's teaching ability for each
of the courses taught by the candidate in the past two years. Although an
increasing number of colleges and universities have been using student
evaluations to aid in the assessment of the quality of teaching, very little is
known about whether or not this information is utilized at the level of
promotion decisions. Thus the problem has practical importance.

Because of the design of the study, two supplementary issues could also
be investigated. These were whether or not the sex and academic department
of the candidate affect the weighting of research and teaching in promotion
decisions. The sex issue is of interest in light of recent Affirmative Action
programs and disputes about the existence, or extent, of sex discrimination in
university promotion decisions. The academic department of the candidate was
manipulated primarily to increase the generality of any results.



METHOD

Subjects

Twenty senior faculty members at The University of Michigan were asked to
judge the promotability of cases presented to them as described below. All
judges were either currently members of, or had previously served on,
University of Michigan promotion committees. No one refused the request, but
one judge did not return his ratings.

Materials

Six fictitious individuals were created and their case histories were
prepared in the same format as that used for recommending actual candidates for
promotion at The University of Michigan. The biographical and educational
information for the six candidates was devised in such a way that all
candidates had approximately equivalent backgrounds; i.e., all were
approximately the same age (31-35 years); all had attended prestigious graduate
schools; and all had been first appointed in September, 1969. The independent
variables in the study were the levels of teaching and research competence
associated with each fictitious promotion candidate and the type of information
presented about teaching. For each candidate, alternate versions of each case
history were prepared with different levels of research productivity and/or
teaching ability. The combinations of teaching and research levels employed are
presented in Table 7.



TABLE 7

COMBINATIONS OF TEACHING ABILITY AND RESEARCH PRODUCTIVITY
USED IN THE EXPERIMENTAL MATERIALS

Research                      Teaching Ability
Productivity         Excellent      Medium       Poor

Excellent               (a)           (b)         (c)
Poor                    (d)           (e)         (f)



The levels along the Teaching Ability continuum were created in two ways.
In the chairman's-report condition, the promotion candidates were rated on
their teaching ability in purely verbal form, as conveyed by means of the
department chairman's opinions concerning the candidate's teaching ability. The
types of phrases characterizing each level were: Excellent: "excellent,"
"superior," and "truly outstanding"; Medium: "about average"; and Poor:
"somewhat below average," "not particularly impressive," and "perhaps not
outstanding." In the student-rating condition numerical averages of the student
ratings of teaching ability were included in the evaluations as well as the
verbal phrases characterizing each level. The student ratings were represented
as being derived from a 5-point scale ranging from 1 (for excellent) to 5 (for
poor). The average ratings for each level were: excellent, 1.[?]0; medium,
2.60; and poor, 3.7[?]. These values were chosen because they approximated
best, medium, and worst ratings received by a large group of teachers over a
number of years at The University of Michigan. Type of teaching evaluation was
treated as a between-subject factor, with one group of subjects receiving only
the chairman's report of the candidate's teaching (hereafter designated as the
chairman's-report condition), and 10 subjects receiving the student-rating
information as well as the chairman's report (hereafter referred to as the
student-rating condition).

The two levels of research productivity were created by varying the number
of research publications listed in the candidate's vita from an average
of 3.1, with a range from 2 to 4, for the low-productivity level, to an average
of 13.5, with a range from 11 to 16, for the high-productivity level.
Also varied across the two levels of research productivity were the types
of descriptive comments included in the evaluation of the research. In the
low-productivity cases the comments included the following phrases: "not
impressive in quantity," "not one of the most productive," "not very active,"
and "perhaps not outstanding." The comments in the high-productivity cases
included such phrases as: "large number of high-quality articles in
prestigious journals," "consistently high quality," "international
recognition," "solid scientific reputation," and "impressive in quantity and
quality."

Each of the evaluators or subjects in the study received six promotion
case histories to evaluate, one at each of the combinations of teaching
ability and research productivity described above. The particular promotion
candidate assigned to each combination level was varied across evaluators such
that each promotion candidate was evaluated at several levels of teaching
ability and research productivity.
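
One way to vary the candidate assigned to each combination across
evaluators is a simple rotation, sketched below. The report does not
describe the actual assignment scheme, so the rotation, the candidate
labels, and the function name materials_for are assumptions made only for
illustration. The cell letters follow Table 7.

```python
# Hypothetical labels for the six fictitious candidates.
CANDIDATES = ["A", "B", "C", "D", "E", "F"]
# The six teaching-by-research combinations, lettered as in Table 7.
COMBINATIONS = ["a", "b", "c", "d", "e", "f"]


def materials_for(evaluator_index):
    """Return a {combination: candidate} packet for one evaluator.

    Candidates are rotated one position per evaluator, so over six
    consecutive evaluators each candidate appears once in every cell.
    """
    shift = evaluator_index % len(CANDIDATES)
    return {
        cell: CANDIDATES[(position + shift) % len(CANDIDATES)]
        for position, cell in enumerate(COMBINATIONS)
    }


# Packets for the first six evaluators.
packets = [materials_for(i) for i in range(6)]
```

A rotation of this kind keeps any single candidate's biography from being
tied to one level of teaching or research, so that differences in decisions
reflect the manipulated levels rather than the particular case history.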

Two other aspects of the candidates were also manipulated. These were
the sex of the candidate (one of the six candidates presented to each evaluator
was a female and the other five males) and the academic department with which
the candidate was associated (Psychology for three of the candidates and
Physics for three of the candidates). The sex manipulation was achieved by
pairing two candidates and describing one as a female and the other as a male
in half of the case histories and then reversing this sex relationship for the
remaining case histories. The academic department manipulation was achieved by
presenting an equal number of Physics candidates and Psychology candidates to
each of the evaluators.

Special care was exercised throughout the construction of all case histories
to insure that the portrayals of the promotion candidates were as realistic
as could be managed within the constraints of the study. Since the realism of
the case histories was considered to be essential for the validity of the study,
all of the evaluators were asked to rate the realism of the case histories upon
completion of their evaluation decisions.

Because all other interpretations would be tainted if the experimental
materials were perceived by the subjects to be unrealistic, the data concerning
the realism judgments need to be considered before evaluating the other results.
Only eighteen subjects responded to this item in the questionnaire, but their
responses were encouraging, as may be seen in Table 8.



TABLE 8

SUBJECTS' JUDGMENTS ABOUT THE REALISM OF THE MATERIALS

              "Very        "Generally    "Not Very     "Not at All
              Realistic"   Realistic"    Realistic"    Realistic"

Number of
subjects          3            14             0             1



Several subjects offered comments about the materials, and from these the 
most serious reservation seemed to be that there were no letters from experts 
outside the department evaluating the candidate's research. Other comments 
mentioned that there was too little variation across candidates, and that there 
was little or no service on department or university committees in any of the 
candidate descriptions. In general, however, the materials were apparently 
believable and the decisions made regarding them were taken seriously. 



PROCEDURE 

Each evaluator was asked to make both a decision regarding promotion (yes 
or no) and a decision regarding the amount of salary increase (from $0 to $1500
per year) for each of six fictitious promotion candidates. Upon completion of
the promotion and salary decisions for all candidates, the evaluators answered
a questionnaire containing items designed to assess the evaluators':

(a) rank-orderings of the candidates in terms of their
desirability for promotion;



(b) opinions of the realism of the case histories;

(c) estimates of the relationship between teaching ability
and research productivity in current faculty members at
The University of Michigan; and

(d) opinions of the most desirable combination of teaching
excellence and research excellence.

The evaluators' opinions concerning the most desirable combination of
research and teaching competences for a promotion candidate were elicited in
the following manner: First, each evaluator was shown a graph in which the
axes represented arbitrary scales of excellence such as percentile ranks. The
ordinate was used to indicate increases in research quality and productivity 
and the abscissa was used to indicate increases in teaching quality and effec- 
tiveness. Next, the evaluators were requested "...to draw a line on the graph, 
to enclose the region of research and teaching percentile values in which a 
candidate would be seriously considered for promotion ... (that is) ... outline a
region that would include all the possible combinations of teaching ability or 
excellence and research ability or excellence in which a candidate possessing 
those percentiles would receive serious consideration for promotion." 

All of the materials described above (i.e., the six case histories, the
salary and promotion decision forms, and the final questionnaire) were mailed
to the evaluators along with a cover letter of explanation and instruction.
It is estimated that the evaluators took from 1 to 6 hours to complete the
materials. The materials were returned by nineteen of the subjects from one
to eight weeks after the initial mailing. The twentieth subject failed to
return the materials and, since attempts to contact him were unsuccessful, he
was dropped from the study.

RESULTS 

The first dependent variable to be considered is the percentage of "yes" 
promotion decisions at each level of teaching ability and research productivity.
These percentages are listed in Table 9.






TABLE 9

PERCENTAGE "YES" DECISIONS AT EACH LEVEL
OF TEACHING ABILITY AND RESEARCH PRODUCTIVITY

[Table body not recoverable from the scanned source. Rows: research
productivity (Excellent, Poor). Columns: teaching ability (Excellent,
Medium, Poor) and Mean. Each cell is split by a diagonal into two entries.]

Note: The numbers above the diagonals are the means from the 10 subjects in 
the condition in which student ratings were reported, and the numbers 
below the diagonals are the means from the 9 subjects in the chairman's-report
condition.



Discussion of the pattern of results in Table 9 will be deferred until 
after the results from the other dependent variables have been presented. 

In addition to the decisions about whether each candidate should be promoted,
the subjects also made independent judgments about the amount of salary
increase appropriate for each candidate, and the rank order of the candidates
in terms of their desirability for promotion. Both of these sets of data were
subjected to analyses of variance, using the mean values of each cell to
substitute in the case of missing data. The three factors in the analysis of the
salary data were: (a) the type of teaching-evaluation information; (b) the
level of teaching ability; and (c) the level of research productivity. Only
the teaching-level and research-level factors could be analyzed with the
rank-order data since the ranks for each group and, indeed, for every subject, were
constrained to have the same mean. The results from the two analyses of variance
are presented in Tables 10 and 11.
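The substitution of cell means for missing observations can be illustrated with a minimal sketch. The report does not give the procedure in detail, so the function name, the dictionary layout, and the sample values below are hypothetical and serve only to show the idea.

```python
from statistics import mean


def fill_missing_with_cell_mean(cells):
    """Replace missing observations (None) with the mean of their own cell.

    `cells` maps a design cell, here a hypothetical
    (teaching level, research level) pair, to a list of observations.
    A cell with no observed values at all would raise StatisticsError.
    """
    filled = {}
    for cell, values in cells.items():
        observed = [v for v in values if v is not None]
        cell_mean = mean(observed)
        filled[cell] = [cell_mean if v is None else v for v in values]
    return filled


# Illustrative salary increases only; these are not data from the study.
example = {("excellent", "poor"): [900, None, 1100]}
filled_example = fill_missing_with_cell_mean(example)
```

Filling a gap with the cell mean leaves that cell's mean unchanged but reduces its apparent variability, which is worth keeping in mind when reading the F values.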






TABLE 10

MEAN SALARY INCREASES AT EACH LEVEL OF
TEACHING ABILITY AND RESEARCH PRODUCTIVITY

[Table body not recoverable from the scanned source. Rows: research
productivity (Excellent, Poor). Columns: teaching ability (Excellent,
Medium, Poor) and Mean. Each cell is split by a diagonal into two entries.]





Analysis of Variance Results
From the Salary and Rank-Order Data

Dependent Variable: Salary Increase

Factor                                        df      F          ω²

Type of teaching-evaluation information       1,18    .06        0
Teaching ability                              2,36    55.99*     .225
Research productivity                         1,18    164.13*    .864
Interaction (type of teaching-evaluation
  information X teaching ability)             1,36    1.21

* p < .001.

Note: For all other interactions F < 1.0.

Note: The ω² index supplements the F values by providing an estimate of the proportion
of the variance in the dependent variable accounted for by the independent
variable (see Hays, 1965).
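The index cited from Hays (1965) is an estimate of the proportion of variance accounted for by a factor. As a rough sketch, the form usually given for a fixed-effects analysis of variance is shown below. The report does not say which variant was applied to its mixed design, so the formula choice, the function name, and the sample numbers are assumptions for illustration only.

```python
def omega_squared(ss_effect, df_effect, ss_total, ms_error):
    """Estimate of the proportion of variance accounted for by an effect.

    Uses the common fixed-effects form
        (SS_effect - df_effect * MS_error) / (SS_total + MS_error).
    Negative estimates are reported as 0, an assumed convention.
    """
    estimate = (ss_effect - df_effect * ms_error) / (ss_total + ms_error)
    return max(0.0, estimate)


# Illustrative sums of squares only; these are not values from the study.
example_estimate = omega_squared(ss_effect=50.0, df_effect=2,
                                 ss_total=150.0, ms_error=5.0)
```

Unlike F, this estimate does not grow simply because more subjects are tested, which is why it is useful for comparing the relative weight of two factors.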






TABLE 11

MEAN RANKS OF PROMOTABILITY

[Table body not recoverable from the scanned source. Rows: research
productivity (Excellent, Poor). Columns: teaching ability (Excellent,
Medium, Poor) and Mean. Each cell is split by a diagonal into two entries.]





Dependent Variable: Rank Order

Factor                                        df      F           ω²

Teaching ability                              2,36    65.16*      .224
Research productivity                         1,18    1058.40*    .680
Interaction (type of teaching-evaluation
  information X teaching ability)             1,36    2.80

* p < .001.

Note: For all other interactions F < 1.0.

Note: The ω² index supplements the F values by providing an estimate of
the proportion of the variance in the dependent variable accounted
for by the independent variable (see Hays, 1965).






Following the convention introduced in Table 9, the numbers above the
diagonals are from the subjects in the student-rating condition, and the numbers
below the diagonals are from the subjects in the chairman's-report
condition.

The results presented thus far can be summarized quite briefly. First,
in none of the analyses did type of teaching-evaluation information make
a significant difference in the promotion decisions. Second, all of the data
indicate a considerably larger emphasis on research productivity than on
teaching ability. This is obvious in the relative magnitudes of the F values from
the analyses of variance, and particularly in the size of the ω² estimates.
It is also apparent by inspecting Tables 9, 10, and 11, and comparing the range
of values across research levels (i.e., the row means) with the range of values
across teaching levels (i.e., the column means), or by comparing the means of
the excellent research-poor teaching combination with the means of the poor
research-excellent teaching combination. All of these comparisons reveal that
in making promotion decisions, in making salary-increase decisions, and in
assigning rank-orders to promotion candidates, the subjects placed much more
emphasis on research productivity than upon teaching ability.

Additional Results: Sex and Department

The results from the supplementary issues concerning the sex and academic 
department of the candidate are presented in Table 12. Although there were 
certain consistent trends in all three dependent measures, the absolute mag- 
nitudes of the differences in Table 12 are quite small, particularly in com- 
parison to the differences obtained across the various teaching and research 
levels in Tables 9, 10, and 11. Because of a mistake in implementing the 
design there was a partial confounding of sex and the particular combination 
of teaching ability and research productivity. This prevented the use of 
statistical tests on the data, and only weighted averages could be used to
obtain the summary results in Table 12. 



TABLE 12

ANALYSES OF CANDIDATE'S SEX AND ACADEMIC
DEPARTMENT AS FACTORS INFLUENCING DECISIONS

[Table body not recoverable from the scanned source. Rows: Males, Females;
Academic Department (Psychology, Physics). Columns: Percent Promotions,
Salary Increase, Rank Order.]






The data from the question dealing with the most desirable combination of
teaching excellence and research excellence in the final questionnaire
administered to the subjects lend further support to the conclusion that a much heavier
emphasis is placed on research than on teaching in evaluating candidates for
promotion. It will be remembered that the subjects were instructed to outline
a region on a two-dimensional graph that represented the most desirable
combinations of teaching and research excellence for a promotion candidate. These
regions were analyzed by computing the areas of the outlined regions above the
positive diagonal (i.e., those regions in which the emphasis on research
excellence, scaled on the ordinate, is more than the emphasis on teaching excellence,
scaled on the abscissa), and the regions below the positive diagonal (i.e.,
where teaching excellence, scaled on the abscissa, is emphasized more than
research excellence, scaled on the ordinate) and dividing the area above the
diagonal by the area below the diagonal. The ratio resulting from these
computations might then be considered a measure of relative research emphasis, in
the abstract, since it is not dependent upon any particular case histories as
were the other analyses reported earlier. The mean ratio for the 18 subjects
completing this item in the questionnaire was 1.79 in favor of research. Nine
of the subjects assigned equal weight to teaching excellence and research
excellence; none placed greater emphasis on teaching than on research.
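The area-ratio computation described above can be sketched as follows, assuming each outlined region is digitized as a simple polygon of (teaching percentile, research percentile) points. The digitization step, the function names, and the sample region are hypothetical; the report does not say how the areas were actually measured.

```python
def polygon_area(points):
    """Unsigned area of a simple polygon [(x, y), ...] by the shoelace formula."""
    total = 0.0
    n = len(points)
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip_to_diagonal(points, keep_above):
    """Clip a polygon to research >= teaching (keep_above) or research <= teaching.

    One pass of Sutherland-Hodgman clipping against the line y = x,
    with x = teaching (abscissa) and y = research (ordinate).
    """
    sign = 1.0 if keep_above else -1.0
    clipped = []
    n = len(points)
    for k in range(n):
        cur, nxt = points[k], points[(k + 1) % n]
        s_cur = sign * (cur[1] - cur[0])
        s_nxt = sign * (nxt[1] - nxt[0])
        if s_cur >= 0:
            clipped.append(cur)
        if (s_cur > 0 and s_nxt < 0) or (s_cur < 0 and s_nxt > 0):
            t = s_cur / (s_cur - s_nxt)  # where the edge crosses the diagonal
            clipped.append((cur[0] + t * (nxt[0] - cur[0]),
                            cur[1] + t * (nxt[1] - cur[1])))
    return clipped


def research_emphasis_ratio(region):
    """Area above the positive diagonal divided by the area below it."""
    above = polygon_area(clip_to_diagonal(region, keep_above=True))
    below = polygon_area(clip_to_diagonal(region, keep_above=False))
    return above / below if below > 0 else float("inf")


# Hypothetical region: any teaching percentile, research percentile of 60 or more.
example_region = [(0, 60), (100, 60), (100, 100), (0, 100)]
example_ratio = research_emphasis_ratio(example_region)
```

For this hypothetical region the area above the diagonal is 3200 and the area below is 800, a ratio of about 4 in favor of research. A region symmetric about the diagonal gives a ratio of 1, which corresponds to equal weight on teaching and research.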

A final datum from the post-experiment questionnaire was the subjects'
estimates of the actual correlation between research productivity and teaching
ability in current members of the faculty at The University of Michigan. The 
mean correlation from the 1? subjects reporting an estimate was +.^6, with a 
range from +.2'5 to +.7^. 



DISCUSSION 

Before discussing the results of the study, it is perhaps best to mention 
some of the limitations governing any interpretations. First, there is the 
very obvious limitation that the results cannot be generalized beyond a single 
university at a given point in time. Second, the case histories were perhaps 
too simplistic since they were constructed so as to vary along only two major 
dimensions. And third, there is no assurance that the different levels along 
the teaching-ability continuum are in any way equivalent to the different levels
along the research-productivity continuum. That is, the difference between
what we have termed excellent and poor teachers may not have represented
the same difference as that between what we have termed excellent and poor
researchers. 

With these limitations in mind, we can nonetheless still be optimistic
about the potential of this type of study. We were successful in obtaining
thoughtful representative judgments from subjects of the target population
with which we were concerned, university promotion committee members. And,
the materials were judged to be "generally realistic." Thus our methodology
seems worthwhile for further studies.






One of the major goals in the study was to determine whether the type of 
teaching-evaluation information provided in the promotion candidate's descrip- 
tion affected the decision regarding that candidate's promotion, salary increase, 
or relative rank in a "desirability-for-promotion" scale. As the results of
the analysis of variance on the salary data indicate, and as is evident in Tables
9, 10, and 11, the addition of the student-rating information apparently made
little difference in any of the decisions. From the present results, we find 
little evidence that information from student ratings of teaching is utilized 
where decisions regarding promotions and salaries are made. 

It is interesting to speculate about the reasons for the failure to utilize 
the additional, and possibly more objective, information about the candidate's 
teaching. The most obvious interpretation is that the quality of the teaching 
was considered to be a very small factor in evaluating the candidates, and hence, 
the specific type of information used to assess the teaching did not really 
matter. 

A second possibility for the apparent failure to utilize the student-rating
information is that the subjects may not have believed that student ratings were 
an accurate, or valid, way to assess teaching quality. 

A third possibility is that the form in which the student ratings were pre- 
sented was not persuasive. We suspect that a combination of statistical summary 
and direct quotations would be more persuasive than the numbers alone. 

A fourth possibility is that teaching information is critical to decisions 
only when the candidate's research qualifications are not clearly excellent or
poor. In such cases teaching may tip the balance and the quality of information 
provided about teaching may be critical. With our basic methodology apparently 
established as adequate, we hope to investigate this possibility next. 

The secondary issues of interest in the present study, concerning the sex 
and academic department of the candidate, appeared not to be major factors in 
the decisions, although a confounding with teaching-research level precluded a
formal statistical test. 

One of the clearest findings in the experiment was the marked emphasis on 
research productivity compared to teaching ability. That there is such an em- 
phasis on research is not surprising, but the current techniques allow reason- 
ably precise estimates of the relative emphasis on research compared to teaching. 
Indeed, so many estimates are available (see Table 13) that one is faced with
the problem of deciding which particular one is best. Fortunately, all of the 
estimates are reasonably similar, indicating approximately twice as much empha- 
sis on research compared to teaching. 



TABLE 13

[Table body not recoverable from the scanned source; the table was printed
sideways. Per the text, it lists the available estimates of the relative
emphasis on research compared to teaching.]



Perhaps the contribution of greatest import in the present study is the 
introduction of a viable methodology for the investigation of decision making
in academic institutions. It is our hope that the research reported here will
serve as an impetus to begin the systematic investigation of these decision
processes. 






STUDY III: DO DISCREPANCIES BETWEEN STUDENT RATINGS, TEACHER EXPECTATIONS,
AND TEACHER IDEALS RESULT IN CHANGES IN TEACHER BEHAVIOR?



INTRODUCTION 


Much attention has been given to student ratings in recent years. Much
research has investigated whether or not student ratings of instruction are
related to the effectiveness of teachers as determined by student achievement.
Other studies have focused on the effect of student ratings on the instructor's
behavior. The present study examined one aspect of this effect: specifically,
do student ratings of teacher effectiveness have differential effects on
instructors whose own perceptions of their teaching ability were similar to
their students' perceptions as compared to those whose perceptions were
discrepant from the student ratings?

Both Centra (1972) and Pambookian (1972) studied student ratings in terms
of such discrepancy. Centra found that teachers who were shown that they had
unrealistically high opinions of their teaching changed the most in a positive
direction. As this discrepancy, where students rated teachers less favorably
than the latter expected, increased there was an increased likelihood of
teacher change. Similarly, when teachers rated themselves as average or poor
and students' ratings concurred, teachers showed very little change despite
their awareness of a need for improvement. Pambookian separated his teacher
sample into three groups: (a) unfavorably discrepant teachers whose
self-ratings were higher than their students' ratings; (b) minimally discrepant
instructors whose own perceptions of their teaching ability were similar to
their students' opinions; and (c) the favorably discrepant teachers whose
students rated them as better than what the teachers themselves perceived of
their abilities. As predicted, Pambookian found that across all dimensions
on the student rating forms, the unfavorably discrepant group improved the
most after seeing their students' ratings, followed by the minimally
discrepant and then favorably discrepant groups.

In the present study an additional factor was considered, that of the 
teacher's opinion of how he would like to teach, termed here as his "ideal." 
Thus teachers were divided among eight classification groups, as enumerated 
below: 

(A) Teacher expected and ideal ratings higher than students'.

(B) Teacher expected rating higher than student pre-test 
(student ratings given after the first five weeks of classes) 
and ideal at or below student pre-test. 

(C) No discrepancy between ideal, expected, and student 
pre-test ratings.






(D) No discrepancy between teacher expected and student 
pre-test ratings, but teacher ideal rating higher. 

(E) No discrepancy between teacher expected and student 
pre-test ratings, but teacher ideal lower. 

(F) Teacher expected rating lower than student pre-test 
and ideal rating higher than student pre-test. 

(G) Teacher expected rating lower than student pre-test 
and ideal close to the student pre-test. 

(H) Teacher expected rating lower than student pre-test
and ideal rating lower than student pre-test.

On the basis of our analysis of research on effect of feedback on per- 
formance, we hypothesized that feedback results in greatest improvement when: 
(a) the knowledge of results gives new information to the learner; (b) the 
learner is motivated to improve; and (c) the learner knows what actions are 
necessary to improve. In terms of the above-listed groups, we expected group
A (corresponding roughly to Pambookian's unfavorably discrepant group) to
show the most improvement as a result of the feedback and group H to show
the most negative changes in performance. 



METHOD 



Sample 

The data were collected as part of the larger study on the effect of the
feedback of student ratings on a teacher's effectiveness. The sample consisted
of 28 instructors of introductory psychology classes at The University of
Michigan.



Measures 

The measure used to assess teacher performance was the Michigan Student 
Perception of Teacher Form. It consists of 32 items distributed among seven 
dimensions: Impact (the intellectual effect of the teacher on the student),
Rapport, Teacher-as-Person, Group Interaction, Difficulty, Structure, and
Feedback. 

In addition, the instructors completed a form indicating their own
perceptions of their teaching ability. For each of the seven dimensions, the
teacher estimated whether he expected his students would rate him to be in
the top 10%, above average, average, below average, or in the bottom 10% of
the sample. Instructors also indicated their "ideal" of where they would
most like to be on each scale.



Procedure 

The students of the teachers in the sample completed the student rating
form for the first time at approximately the one-third point in the semester.
Before the instructors saw these results, they completed the teacher
expectation and ideal evaluation forms.

Approximately two weeks before the end of the semester, the students 
reevaluated their teachers. The same rating form as in the pre-test was used. 

Mean change scores were computed between the pre-test and post-test rat- 
ings for each instructor on each dimension. In addition, discrepancy scores 
were computed between (a) the teacher's expectations and the actual ratings, 
and (b) the teacher's ideal compared to the actual ratings. Instructors were 
then distributed among the previously mentioned groups A through H. Not all 
of these groups emerged on every dimension (see Table 14 for group N's), but
group E was the only one which was never represented on any of the seven
dimensions.
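The scoring and grouping steps described above can be sketched roughly as follows. The report does not state how large a difference counted as a discrepancy or how the teachers' category estimates were placed on the same scale as the student ratings, so the tolerance `tol`, the common numeric scale, and the function names are assumptions made only for illustration.

```python
from statistics import mean


def direction(diff, tol):
    """Return +1, 0, or -1, treating |diff| <= tol as no discrepancy."""
    if diff > tol:
        return 1
    if diff < -tol:
        return -1
    return 0


def classify_teacher(expected, ideal, student_pre, tol=0.25):
    """Assign one of the groups A through H for a single dimension.

    All three values are assumed to be on one numeric scale; `tol`
    is a hypothetical threshold for 'no discrepancy'.
    """
    e = direction(expected - student_pre, tol)
    i = direction(ideal - student_pre, tol)
    if e > 0:
        # Group B covers an ideal at or below the student pre-test rating.
        return "A" if i > 0 else "B"
    if e == 0:
        return {1: "D", 0: "C", -1: "E"}[i]
    return {1: "F", 0: "G", -1: "H"}[i]


def mean_change(pre_ratings, post_ratings):
    """Change score: mean post-test rating minus mean pre-test rating."""
    return mean(post_ratings) - mean(pre_ratings)
```

For instance, under these assumptions a teacher who expected 4.0 and wished for 5.0 but received a pre-test mean of 3.0 falls in group A, while one who expected 2.0 and wished for 1.0 against the same pre-test mean falls in group H.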



RESULTS AND DISCUSSION 

There was some support for our hypothesis that discrepancy between student 
pre-test ratings and teacher expected and ideal ratings would result in 
changes in teacher performance. Table 14 presents the mean change scores for
each group of teachers as well as a brief reiteration of the type of discrep- 
ancy that characterizes each group. Although no significant differences or 
patterns emerged regarding the change scores among the teacher groups for most 
dimensions, there were significant differences on the dimensions of Group 
Interaction and Feedback. 

Groups A and G showed the most change on Group Interaction after the pre- 
test ratings in contrast to groups D, F, and H who only changed slightly, the 
latter two changing in the negative direction. These results support our
hypothesis that the teachers whose expectations and ideal ratings were higher
than the students' ratings (group A) would show the most improvement. Group
G's change in behavior is not consistent with our expectation because neither
their expectations nor their ideals were above the actual ratings.

The mean change scores on the Feedback dimension were more dramatic.
Groups A and F showed the most improvement, while group H changed markedly
in the negative direction. This clearly supports our hypotheses. Groups A
and F were both motivated to improve, whereas the teachers in group H received
ratings which were higher than both their expectations and ideals, thus
providing no motivation to improve. Groups A and H correspond roughly to
Pambookian's unfavorably discrepant and favorably discrepant groups,
respectively. Thus, their change scores on this dimension concur with Pambookian's
findings.

TABLE 14

[Table body not recoverable from the scanned source; the table was printed
sideways. Per the text, it presents the mean change scores and group N's
for each group of teachers on each dimension, with a brief description of
the type of discrepancy that characterizes each group.]






REFERENCES 



Carrier, N. A. Evaluating the Introductory Psychology Course. Reading, Mass.:
Addison-Wesley, 1966.

Centra, J. A. Evaluating college teaching: The rhetoric and the research.
(Paper presented at American Association for Higher Education, Chicago,
Illinois, March, 1972.)

Gage, N. L., Runkel, P. J., and Chatterjee, B. B. Changing teacher behavior
through feedback from pupils: An application of equilibrium theory. In
W. W. Charters and N. L. Gage (Eds.), Readings in the social psychology of
education. Boston: Allyn and Bacon, 1963, pp. 173-181.

Hays, W. L. Statistics. New York: Holt, Rinehart, and Winston, 1965.

Isaacson, R. L., McKeachie, W. J., Milholland, J. E., Lin, Y. G., Hofeller, M.,
Baerwaldt, J. W., and Zinn, K. L. Dimensions of student evaluations of
teaching. Journal of Educational Psychology, 1964, 55, 344-351.

McKeachie, W. J., Lin, Y. G., and Mann, W. Student ratings of teacher
effectiveness: Validity studies. American Educational Research Journal, 1971,
8, 435-445.

McKeachie, W. J. and Lin, Y. G. Multiple discriminant analysis of student
ratings of college teachers. Unpublished paper. The University of Michigan,
Psychology Department, 1973.

Milholland, J. E. Measuring cognitive abilities. In McKeachie, W. J.,
Isaacson, R. L., and Milholland, J. E. Research on the Characteristics of
Effective College Teaching. (Final report: Co-operative Research Project
No. SAE 850, United States Office of Education and The University of Michigan.)
Washington, D.C.: U.S. Government Printing Office, 1964.

Miller, M. T. Instructor attitudes toward, and their use of student ratings
of teachers. Journal of Educational Psychology, 1971, 62, 235-239.

Pambookian, H. S. The effect of feedback from students to college instructors
on their teaching behavior. Journal of Educational Psychology, in press.

Tuckman, B. W. and Oliver, W. F. Effectiveness of feedback to teachers as a
function of source. Journal of Educational Psychology, 1968, 59, 297-301.



APPENDIX 



UNIVERSITY OF MICHIGAN
STUDENT PERCEPTION OF TEACHING



PLEASE PUT YOUR STUDENT NUMBER, COURSE NUMBER AND SECTION NUMBER ON THE
ACCOMPANYING IBM FORM, AS WELL AS ON THIS FORM!



Student No. 
Date

Section 

Instructor 



Your Grade Point - U of M 

3.4 - 4.0

2.9 - 3.3

2.4 - 2.8 

Below 2.4 



First Semester Freshman 

High School Rank 

Top 5% 

Top 25% 

Below 25% 



PLEASE INDICATE ON THE ACCOMPANYING IBM FORM YOUR REACTION TO EACH OF THE 
FOLLOWING STATEMENTS. 



0 = not applicable

1 = almost never or almost nothing

2 = seldom or little

3 = occasionally or moderate

4 = often or much

5 = very often

6 = almost always or a great deal



WRITE IN AFTER THE QUESTION ANY COMMENTS THAT YOU WISH TO MAKE. GIVE EXAMPLES 
WHEREVER POSSIBLE. 






IMPACT ON STUDENTS

1. The instructor stimulates my intellectual curiosity.
Comments:



2. I am learning how to think more clearly about the area of this course.
Comments:



3. I am learning how to read materials in this area more effectively.



4. The instructor is effective in conveying the larger human context within which
this subject lies.
Comments:



5. I am acquiring a good deal of knowledge about the subject.
Comments:



6. The course is making a significant contribution to my self-understanding.
Comments:






7. The course is increasing my interest in learning more about this area.
Comments:



8. I am generally bored in this class.
If yes, why?



9. The instructor is enthusiastic.
Comments:



10. The instructor gives good examples of the concepts.
Comments:



11. The definitions and concepts given in class are generally clear.
Comments:



12. The instructor goes into too much detail.
Comments:






13. Students are confused.
Comments:



14. The instructor is able to tell when students are confused.
Comments:



15. The instructor is helpful when students are confused.
Comments:



16. The instructor seems knowledgeable in many areas besides psychology.
Comments:



RAPPORT

17. The instructor is permissive.
Comments:






18. The instructor is friendly.
Comments:



19. The instructor invites criticism of his/her acts.
Comments:



20. It is very easy to learn to trust the instructor.
Comments:



TEACHER AS PERSON

21. The class is more pleasant than productive.
Comments:



22. The instructor spends so much time being "one of the gang" that we don't
learn as much as we could.
Comments:






GROUP INTERACTION

23. Students volunteer their own opinions.
Comments:



24. Students argue with one another (not necessarily with hostility).
Comments:



25. Students feel free to argue with the instructor.
Comments:



DIFFICULTY

26. The instructor assigns very difficult reading.



27. The instructor asks for more than students can get done in the time available.
Comments:






STRUCTURE

28. The instructor plans class activities in detail.
Comments:



29. The instructor follows an outline closely.
Comments:



FEEDBACK

30. Instructor keeps students informed of their progress.



31. The instructor tells students when they have done a particularly good job.
Comments:



32. Tests and papers are graded and returned promptly.
Comments:






